Skip to content

Conversation

@edolstra
Copy link
Collaborator

@edolstra edolstra commented Jan 5, 2026

Motivation

This PR adds support for extending Nix using WASM (specifically wasmtime) in two ways:

  • It adds a new builtin function, builtins.wasm, that allows Nix functions to be written in any language that compiles to WASM. For example,

    builtins.wasm ./nix_wasm_plugin_fib.wasm "fib" 40

    The goal is to allow complex functionality (like YAML parsers) to be written in more convenient languages (like Rust) and executed faster, without having to resort to import-from-derivation.

    The WASM function can lazily traverse its argument and can produce arbitrary Nix values (except functions). Each WASM function call is executed as a fresh instance of the WASM VM, ensuring determinism (i.e. you can't use the WASM memory to pass state between function calls).

    See Initial bindings/examples for writing Nix functions in Rust nix-wasm-rust#1 for examples of Nix functions written in Rust.

  • It adds a new system type, wasm32-wasip1, that allows for system-independent derivations. For example:

    derivation {
      name = "wasi-test";
      system = "wasm32-wasip1";
      builder = ./nix-wasi-builder-hello.wasm;
      args = [ "World!" ];
    }

    Example here: https://github.com/DeterminateSystems/nix-wasi-rust

Context

Summary by CodeRabbit

Release Notes

  • New Features
    • Added WebAssembly (WASM) module support to nix expressions, enabling integration of WASM modules through a new primitive operation.
    • Introduced WASI (WebAssembly System Interface) support for building and executing WASM-based derivations.

✏️ Tip: You can customize this high-level summary in your review settings.

It adds ~400 MB to the Nix closure which is a lot.
Also, require WASM plugins to export a `nix_wasm_init_v1` function.
We don't want to reuse instances between WASM calls, since that's
impure: the WASM function could use some mutable global state to
return a different value every time. So instead, pre-instantiate
instances, allowing them to be instantiated efficiently.

Unfortunately, while this is much faster than not using
pre-instantiation, it is slower than reusing an instance: about 1.3s
compared to 0.7s for the call.nix test in nix-wasm-plugin-test (which
performs 100K WASM calls).
@edolstra edolstra marked this pull request as draft January 5, 2026 13:03
@coderabbitai
Copy link

coderabbitai bot commented Jan 5, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

This pull request integrates Wasmtime and WebAssembly/WASI support into Nix. It adds Wasmtime as a packaged dependency, includes a WASM primop for executing WebAssembly modules from Nix expressions, implements a WASI derivation builder for running WASI workloads, and updates build configurations and platform checks accordingly.

Changes

Cohort / File(s) Summary
Packaging & Dependencies
packaging/dependencies.nix, packaging/wasmtime.nix
Adds Wasmtime v40.0.0 as a Nix package definition via rustPlatform.buildRustPackage, sourced from bytecodealliance/wasmtime. Wires Wasmtime into the dependencies scope. Outputs include lib, dev, and binary; postInstall handles library management, C API headers, and shell completions.
Expression Language – WASM Primop
src/libexpr/primops/wasm.cc, src/libexpr/primops/meson.build
Introduces a complete WASM integration module. Provides InstancePre, NixWasmInstancePre, and NixWasmInstance classes to manage Wasmtime engine, module loading, and instance state. Implements prim_wasm primop for calling WASM functions from Nix, with memory management, type conversions, and host function registration. Exposes panic/warn helpers and caches instances per path. Includes TOML-based primop registration.
Expression Language – Build Config
src/libexpr/meson.build, src/libexpr/package.nix, src/libexpr/primops.cc
Adds -lwasmtime linker flag to nixexpr library. Includes wasmtime in buildInputs. Changes realisePath from static to public linkage to expose it for WASM integration.
Store & Derivation – WASI Builder
src/libstore/unix/build/wasi-derivation-builder.cc, src/libstore/unix/build/derivation-builder.cc
Introduces WasiDerivationBuilder deriving from DerivationBuilderImpl. Initializes Wasmtime engine, configures WASI sandbox (stdin/stdout/stderr, argv, environment, preopened store directory), loads WASM module from builder path, and invokes _start export. Early-returns from makeDerivationBuilder for wasm32-wasip1 platform before platform-specific logic.
Store – Build Config & Platform Support
src/libstore/meson.build, src/libstore/derivation-options.cc
Adds -lwasmtime linker flag to nixstore library. Extends canBuildLocally to reject wasm32-wasip1 builds unless listed in extraPlatforms or isBuiltin.

Sequence Diagram(s)

sequenceDiagram
    participant User as Nix User<br/>(prim_wasm call)
    participant EvalState
    participant WasmCache as Instance<br/>Cache
    participant Wasmtime as Wasmtime<br/>Engine/Linker
    participant Module as WASM<br/>Module
    participant Instance as WASM<br/>Instance
    participant Memory as WASM<br/>Memory
    participant HostFns as Host<br/>Functions

    User->>EvalState: prim_wasm(path, entry, arg)
    EvalState->>WasmCache: lookup instance by path
    alt Instance exists
        WasmCache-->>EvalState: return cached NixWasmInstancePre
    else First time
        EvalState->>Wasmtime: create Engine & Linker
        EvalState->>HostFns: register regFuns
        EvalState->>Module: load module from path
        Wasmtime->>Module: instantiate with Linker
        Module->>Instance: create instance
        Instance->>Memory: access WASM memory
        WasmCache->>WasmCache: cache by path
        WasmCache-->>EvalState: return new NixWasmInstancePre
    end

    EvalState->>Instance: locate exported function (entry)
    Instance-->>EvalState: function ref
    EvalState->>EvalState: prepare arg as Value
    EvalState->>Memory: copy arg into WASM memory
    EvalState->>Instance: call function with memory id
    Instance->>HostFns: invoke host functions as needed
    HostFns->>Memory: read/write values
    Instance-->>EvalState: return result id
    EvalState->>Memory: read result from WASM memory
    EvalState-->>User: return as Nix Value
Loading
sequenceDiagram
    participant Store as Nix<br/>Store
    participant Builder as Build<br/>Dispatcher
    participant WasiBuilder as WASI<br/>Derivation<br/>Builder
    participant Wasmtime as Wasmtime<br/>Engine
    participant WasmModule as WASM<br/>Module
    participant WasiEnv as WASI<br/>Environment
    participant Instance as WASM<br/>Instance

    Store->>Builder: makeDerivationBuilder(drv: wasm32-wasip1)
    Builder->>Builder: check platform == wasm32-wasip1
    alt WASI Platform
        Builder->>WasiBuilder: instantiate for WASI build
        WasiBuilder->>Wasmtime: new Engine & Linker
        WasiBuilder->>WasiEnv: configure WASI<br/>(stdin, stdout, stderr,<br/>argv, env, preopened dirs)
        WasiBuilder->>WasmModule: load from drv.builder path
        Wasmtime->>Instance: instantiate module
        WasiBuilder->>Instance: locate _start export
        Instance->>Instance: verify is function
        WasiBuilder->>Instance: invoke _start() with no args
        Instance->>WasiEnv: interact with WASI calls
        Instance-->>WasiBuilder: execution complete
        WasiBuilder->>WasiBuilder: exit with code 0
        WasiBuilder-->>Store: build complete
    else Non-WASI Platform
        Builder->>Builder: use platform-specific builder
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Wasmtime hops into Nix with glee,
WASM modules dance wild and free,
WASI builds sprout from the code,
New pathways paved down the road,
From engine to instance we weave,
New magic for those who believe!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 4.44% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title "WASM support" is vague and generic, using a broad term that doesn't convey the specific implementation details or scope of the changes. Consider a more descriptive title that captures the main additions, such as "Add WebAssembly support via wasmtime integration and wasm32-wasip1 system type" or "Introduce builtins.wasm for calling WASM functions from Nix expressions."
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Fix all issues with AI Agents 🤖
In @src/libexpr/primops/wasm.cc:
- Around line 255-267: The code reads ValueId uint32_t values directly from WASM
memory in make_list, copy_list, make_attrset, and copy_attrset, assuming host
endianness; fix this by converting each 32-bit word from little-endian to host
order when constructing ValueId (e.g., apply le32toh or an equivalent portable
little-endian-to-host conversion) for every read from the memory subspan (the
vs[] accesses) so ValueId values are correct on big-endian hosts; update all
occurrences in the listed functions to perform this conversion before indexing
into values or assigning into lists/attrsets.
- Around line 334-350: In copy_attrname, replace the compile-time asserts with
runtime validations: check that attrIdx < attrs.size() and that len ==
name.size(); if either check fails, raise a runtime error (e.g., throw
std::runtime_error with a clear message mentioning valueId, attrIdx, len and
name.size()) or otherwise signal a trap/error path instead of proceeding; only
perform memcpy when both checks pass and keep the rest of the logic
(state.forceAttrs, obtaining name from state.symbols, and writing into memory())
unchanged.
- Around line 433-439: The static variable name `primop_fromTOML` is misleading
for the WASM primop registration; rename the variable to a descriptive
identifier like `primop_wasm` (i.e., change `static RegisterPrimOp
primop_fromTOML(...` to `static RegisterPrimOp primop_wasm(...`) so it matches
the registered `.name = "wasm"` and `.fun = prim_wasm`; update any local
references to this static variable if present in the file to use the new name.
- Around line 309-332: The copy_attrset function writes attributes into WASM
memory using enumerate(*value.attrs()) which is non-deterministic; to fix,
gather the attributes into a temporary vector (e.g., vector of pairs or pointers
to attr), sort that vector lexicographically by the attribute name string via
state.symbols[attr.name], then iterate the sorted vector to populate
buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().
- Around line 411-418: The static unordered_map instancesPre is not thread-safe
and leaks entries; replace it with a thread-safe concurrent map (use
boost::concurrent_flat_map keyed by SourcePath) and store weak references
instead of owning refs so entries can be reclaimed (e.g., map value -> weak
pointer to NixWasmInstancePre or equivalent weak_ref). Update the lookup/insert
logic around instancesPre and the use of make_ref<NixWasmInstancePre>(wasmPath)
so you first attempt to lock the weak entry, create a new instance only if
expired, and store a weak reference in the concurrent_flat_map; optionally add
an eviction or LRU policy if unbounded growth remains a concern.
- Around line 49-53: The subspan function currently uses assert to check buffer
size which is omitted in release builds; replace the assert in
subspan(std::span<uint8_t> s, size_t len) with a runtime check that throws a
descriptive exception (e.g., std::out_of_range or std::length_error) when
s.size() < len * sizeof(T), include the required header ( <stdexcept> ), and
keep the returned span construction (use reinterpret_cast<T*>(s.data()) for
clarity) so callers get a safe failure instead of potential buffer overruns.
🧹 Nitpick comments (12)
src/libstore/derivation-options.cc (1)

399-400: Consider requiring a system feature for WASM support.

The special-casing of "wasm32-wasip1" allows any system to build WASM derivations, which aligns with the PR's goal of system-independent derivations. However, this assumes Wasmtime is always available.

Consider whether wasm32-wasip1 builds should require a system feature (e.g., "wasm" or "wasmtime") to ensure the runtime is actually available on the build machine. This would be checked by the existing getRequiredSystemFeatures logic at lines 406-408.

🔎 Alternative approach using system features

If you want to gate WASM builds on explicit system capability:

  1. Have wasm32-wasip1 derivations declare requiredSystemFeatures = ["wasm"];
  2. Remove the special case here, letting the normal system feature check at lines 406-408 handle it:
-    if (drv.platform != settings.thisSystem.get() && drv.platform != "wasm32-wasip1"
-        && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin())
+    if (drv.platform != settings.thisSystem.get()
+        && !settings.extraPlatforms.get().count(drv.platform) && !drv.isBuiltin())
         return false;

This would make WASM support more explicit and consistent with how other cross-compilation scenarios are handled.

src/libstore/meson.build (1)

383-383: Consider using Meson's dependency system instead of hardcoded linker flags.

Directly appending -lwasmtime to link_args bypasses Meson's dependency resolution, which can lead to portability and configuration issues:

  • No verification that the wasmtime library is actually available
  • Library search paths may not be configured correctly
  • Platform-specific library naming conventions (e.g., .dll vs .so) are not handled
  • Build will fail with cryptic linker errors rather than clear dependency errors
🔎 Recommended approach using Meson dependency system

Near the top of the file with other dependencies (around line 109-161), add:

+wasmtime = dependency('wasmtime', required : true)
+deps_private += wasmtime

Then remove the hardcoded linker flag:

-  link_args : linker_export_flags + [ '-lwasmtime' ],
+  link_args : linker_export_flags,

This provides better error messages, respects pkg-config paths, and handles platform differences automatically.

src/libexpr/meson.build (1)

242-242: Consider using Meson's dependency system instead of hardcoded linker flags.

This has the same portability concerns as the identical change in src/libstore/meson.build at line 383. Hardcoded -lwasmtime bypasses Meson's dependency resolution and can cause issues with library paths, platform-specific naming, and error reporting.

🔎 Recommended approach using Meson dependency system

Near the dependencies section (around lines 41-92), add:

+wasmtime = dependency('wasmtime', required : true)
+deps_private += wasmtime

Then remove the hardcoded linker flag:

-  link_args : linker_export_flags + [ '-lwasmtime' ],
+  link_args : linker_export_flags,

This approach provides better error messages, respects pkg-config configurations, and handles cross-platform differences automatically.

src/libstore/unix/build/derivation-builder.cc (1)

2033-2035: Early return for wasm32-wasip1 bypasses later sandbox / uid‑range checks

Routing params.drv.platform == "wasm32-wasip1" directly to WasiDerivationBuilder happens before:

  • The diverted store check (storeDir != realStoreDir) and
  • The !useSandbox && params.drvOptions.useUidRange(drv) guard.

That means for WASI builds:

  • uid-range requests will no longer be rejected when sandboxing is effectively disabled here.
  • Diverted stores on non‑Linux platforms will also be allowed for this path, unlike other builders.

If this is intentional (relying on the WASI runtime rather than OS sandboxing), consider making that explicit and either:

  • Reject uid-range in WasiDerivationBuilder’s path (or in this branch) so configuration errors are surfaced, or
  • Document that uid-range is ignored for wasm32-wasip1 and that OS sandboxing semantics differ from other systems.
src/libexpr/primops.cc (1)

168-188: Making realisePath non‑static for reuse is reasonable

Exposing SourcePath realisePath(...) (no longer static) with the optional SymlinkResolution parameter allows other primop modules (e.g. WASM/WASI ones) to share the same path + context resolution logic. The implementation is consistent with existing callers, and the default (Full) preserves previous behavior.

Just ensure there is a matching declaration in a header (e.g. primops.hh) so external uses stay type‑checked and the function remains the single canonical implementation.

src/libstore/unix/build/wasi-derivation-builder.cc (1)

5-18: Deduplicate unwrap / string2span helpers and tighten const‑correctness

Both unwrap and string2span appear to be cut‑and‑paste from src/libexpr/primops/wasm.cc, which means future changes to the Wasmtime API or error handling will have to be done in two places.

Consider:

  • Moving these helpers to a small shared header (e.g. a wasm-runtime-utils.hh) used by both primops and the WASI builder, and
  • Switching string2span to use a std::span<const uint8_t> (and avoiding the C‑style cast) if the Wasmtime C++ API accepts a const span, to keep const‑correctness and avoid accidental mutation.

Please double‑check the current wasmtime::Module::compile signature in your pinned Wasmtime version to confirm whether std::span<const uint8_t> is acceptable and adjust the helper accordingly.

src/libexpr/primops/wasm.cc (6)

6-6: Consider removing using namespace wasmtime; for clarity.

While acceptable in implementation files, explicitly qualifying wasmtime types improves code clarity and prevents potential name collisions with future additions to the nix namespace.


10-15: Refactor: Remove duplicate declaration and include proper header.

The realisePath function is already defined in src/libexpr/primops.cc (lines 167-187). Instead of forward-declaring it here, consider extracting it to a shared header file that both files can include, or making it part of the EvalState API.


94-103: Consider factory method for clearer error handling.

The constructor performs heavy operations (WASM compilation via unwrap()) that can throw. Consider using a static factory method that returns Result<ref<NixWasmInstancePre>, Error> for clearer error handling and easier testing.


182-196: Simplify error throwing pattern.

Line 196 uses an immediately-invoked lambda to throw an error, which is less clear than throwing directly. The ternary chain could also benefit from a switch statement for clarity.

🔎 Proposed refactor
 uint32_t get_type(ValueId valueId)
 {
     auto & value = *values.at(valueId);
     state.forceValue(value, noPos);
-    auto t = value.type();
-    return t == nInt        ? 1
-           : t == nFloat    ? 2
-           : t == nBool     ? 3
-           : t == nString   ? 4
-           : t == nPath     ? 5
-           : t == nNull     ? 6
-           : t == nAttrs    ? 7
-           : t == nList     ? 8
-           : t == nFunction ? 9
-                            : []() -> int { throw Error("unsupported type"); }();
+    switch (value.type()) {
+        case nInt: return 1;
+        case nFloat: return 2;
+        case nBool: return 3;
+        case nString: return 4;
+        case nPath: return 5;
+        case nNull: return 6;
+        case nAttrs: return 7;
+        case nList: return 8;
+        case nFunction: return 9;
+        default: throw Error("unsupported type");
+    }
 }

230-238: Consider adding documentation for two-phase copy pattern.

This function uses a common but potentially confusing pattern: it only copies data if the buffer is large enough, but always returns the actual size. A brief comment explaining that callers should check returnValue <= maxLen before using the buffer would improve clarity.


426-426: Improve readability by splitting complex expression.

Line 426 chains multiple operations in a single expression, making it difficult to read and debug. Consider breaking it into separate steps.

🔎 Proposed refactor
-        v = *instance.values.at(instance.runFunction(functionName, {(int32_t) instance.addValue(args[2])}).at(0).i32());
+        auto argId = instance.addValue(args[2]);
+        auto results = instance.runFunction(functionName, {(int32_t) argId});
+        auto resultId = results.at(0).i32();
+        v = *instance.values.at(resultId);
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 51dacdd and 1eaba77.

📒 Files selected for processing (11)
  • packaging/dependencies.nix
  • packaging/wasmtime.nix
  • src/libexpr/meson.build
  • src/libexpr/package.nix
  • src/libexpr/primops.cc
  • src/libexpr/primops/meson.build
  • src/libexpr/primops/wasm.cc
  • src/libstore/derivation-options.cc
  • src/libstore/meson.build
  • src/libstore/unix/build/derivation-builder.cc
  • src/libstore/unix/build/wasi-derivation-builder.cc
🧰 Additional context used
🧬 Code graph analysis (3)
src/libstore/derivation-options.cc (1)
src/libstore/include/nix/store/derivation-options.hh (2)
  • drv (184-184)
  • drv (201-201)
src/libstore/unix/build/wasi-derivation-builder.cc (1)
src/libexpr/primops/wasm.cc (7)
  • unwrap (20-25)
  • unwrap (20-20)
  • v (131-136)
  • v (131-131)
  • v (138-143)
  • string2span (38-41)
  • string2span (38-38)
src/libexpr/primops/wasm.cc (1)
src/libexpr/primops.cc (4)
  • state (3218-3224)
  • state (3218-3218)
  • realisePath (168-188)
  • realisePath (168-172)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (6)
src/libexpr/primops/meson.build (1)

12-12: LGTM!

Adding wasm.cc to the primops sources is straightforward and correct.

src/libexpr/package.nix (1)

17-17: LGTM!

The wasmtime dependency is correctly added following the same pattern as toml11 and other build-time dependencies.

Also applies to: 68-68

packaging/dependencies.nix (1)

78-79: Wiring wasmtime into the dependency scope looks correct

wasmtime = pkgs.callPackage ./wasmtime.nix { }; is consistent with how other custom packages are exposed here; no further changes needed.

src/libstore/unix/build/derivation-builder.cc (2)

294-298: Naming comment on realPathInSandbox

The // FIXME: misnomer note correctly reflects that this helper currently just resolves store paths rather than honoring a sandbox path mapping. No functional impact, so this can remain as a low‑priority cleanup marker.


1996-2001: Including wasi-derivation-builder.cc alongside other platform builders is consistent

Adding #include "wasi-derivation-builder.cc" next to the other platform-specific builder includes matches the existing “single TU with included .cc files” pattern. No issues from an integration standpoint.

packaging/wasmtime.nix (1)

1-121: Wasmtime packaging looks solid; minor note on header-copy path robustness

Overall this derivation is well-structured:

  • Uses a pinned Rust toolchain (rust_1_89) and buildRustPackage.
  • Restricts doCheck to platforms with the required SIMD features and skips problematic aarch64 runs.
  • Splits outputs into out, dev, and lib, pruning .so / .a according to enableShared / enableStatic.
  • Handles shell completions correctly for both native and cross builds.
  • Adds an install-time --version check and an updateScript with an appropriate version regex.

One thing to keep in mind is the header copy:

cp -r target/${cargoShortTarget}/release/build/wasmtime-c-api-impl-*/out/include $dev/include

This relies on the exact crate/build directory naming and release profile path; if upstream layout or build profile changes, this will break somewhat opaquely. It may be worth:

  • Asserting that exactly one wasmtime-c-api-impl-* match exists, or
  • Failing with a clearer message if the glob expands to zero or multiple directories.

Otherwise the packaging is in good shape.

Please double-check against your pinned Wasmtime 40.0.0 sources that the wasmtime-c-api-impl-* build output layout (and the cargoShortTarget/release path) is stable; if upstream changes it, the cp -r .../out/include step will be the first thing to break.

Comment on lines +49 to +53
static std::span<T> subspan(std::span<uint8_t> s, size_t len)
{
assert(s.size() >= len * sizeof(T));
return std::span((T *) s.data(), len);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Replace assert with proper error handling.

The assert on line 51 is compiled out in release builds, which could lead to buffer overruns if the size check fails. Use a runtime check that throws an exception instead.

🔎 Proposed fix
 template<typename T>
 static std::span<T> subspan(std::span<uint8_t> s, size_t len)
 {
-    assert(s.size() >= len * sizeof(T));
+    if (s.size() < len * sizeof(T))
+        throw Error("buffer too small: expected at least %d bytes, got %d", len * sizeof(T), s.size());
     return std::span((T *) s.data(), len);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
static std::span<T> subspan(std::span<uint8_t> s, size_t len)
{
assert(s.size() >= len * sizeof(T));
return std::span((T *) s.data(), len);
}
template<typename T>
static std::span<T> subspan(std::span<uint8_t> s, size_t len)
{
if (s.size() < len * sizeof(T))
throw Error("buffer too small: expected at least %d bytes, got %d", len * sizeof(T), s.size());
return std::span((T *) s.data(), len);
}
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 49-53, The subspan function
currently uses assert to check buffer size which is omitted in release builds;
replace the assert in subspan(std::span<uint8_t> s, size_t len) with a runtime
check that throws a descriptive exception (e.g., std::out_of_range or
std::length_error) when s.size() < len * sizeof(T), include the required header
( <stdexcept> ), and keep the returned span construction (use
reinterpret_cast<T*>(s.data()) for clarity) so callers get a safe failure
instead of potential buffer overruns.

Comment on lines +255 to +267
ValueId make_list(uint32_t ptr, uint32_t len)
{
auto vs = subspan<ValueId>(memory().subspan(ptr), len);

auto [valueId, value] = allocValue();

auto list = state.buildList(len);
for (const auto & [n, v] : enumerate(list))
v = values.at(vs[n]); // FIXME: endianness
value.mkList(list);

return valueId;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Address endianness assumption for cross-platform correctness.

The FIXME on line 263 notes that ValueIds are read from WASM memory without endianness conversion. WASM uses little-endian, so this code will fail on big-endian host systems. While big-endian systems are rare, this violates Nix's goal of deterministic, cross-platform builds.

The same issue appears in copy_list (line 278), make_attrset (line 296), and copy_attrset (line 322).

Consider using explicit little-endian conversion functions (e.g., le32toh) when reading multi-byte values from WASM memory.

🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 255-267, The code reads ValueId
uint32_t values directly from WASM memory in make_list, copy_list, make_attrset,
and copy_attrset, assuming host endianness; fix this by converting each 32-bit
word from little-endian to host order when constructing ValueId (e.g., apply
le32toh or an equivalent portable little-endian-to-host conversion) for every
read from the memory subspan (the vs[] accesses) so ValueId values are correct
on big-endian hosts; update all occurrences in the listed functions to perform
this conversion before indexing into values or assigning into lists/attrsets.

Comment on lines +309 to +332
uint32_t copy_attrset(ValueId valueId, uint32_t ptr, uint32_t maxLen)
{
auto & value = *values.at(valueId);
state.forceAttrs(value, noPos, "while copying an attrset into WASM");

if (value.attrs()->size() <= maxLen) {
// FIXME: endianness.
struct Attr
{
ValueId value;
uint32_t nameLen;
};

auto buf = subspan<Attr>(memory().subspan(ptr), maxLen);

// FIXME: for determinism, we should return attributes in lexicographically sorted order.
for (const auto & [n, attr] : enumerate(*value.attrs())) {
buf[n].value = addValue(attr.value);
buf[n].nameLen = state.symbols[attr.name].size();
}
}

return value.attrs()->size();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix: Ensure deterministic attribute ordering.

The FIXME on line 324 correctly identifies that attributes should be returned in lexicographically sorted order for determinism. Nix's evaluation model requires consistent attribute ordering across evaluations.

🔎 Proposed fix
         if (value.attrs()->size() <= maxLen) {
             // FIXME: endianness.
             struct Attr
             {
                 ValueId value;
                 uint32_t nameLen;
             };
 
             auto buf = subspan<Attr>(memory().subspan(ptr), maxLen);
 
-            // FIXME: for determinism, we should return attributes in lexicographically sorted order.
-            for (const auto & [n, attr] : enumerate(*value.attrs())) {
+            // Collect and sort attributes by name for determinism
+            std::vector<std::pair<std::string_view, const Attr*>> sortedAttrs;
+            for (const auto & attr : *value.attrs()) {
+                sortedAttrs.push_back({state.symbols[attr.name], &attr});
+            }
+            std::sort(sortedAttrs.begin(), sortedAttrs.end(),
+                [](const auto & a, const auto & b) { return a.first < b.first; });
+            
+            for (const auto & [n, pair] : enumerate(sortedAttrs)) {
+                const auto & attr = *pair.second;
                 buf[n].value = addValue(attr.value);
                 buf[n].nameLen = state.symbols[attr.name].size();
             }
         }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 309-332, The copy_attrset function
writes attributes into WASM memory using enumerate(*value.attrs()) which is
non-deterministic; to fix, gather the attributes into a temporary vector (e.g.,
vector of pairs or pointers to attr), sort that vector lexicographically by the
attribute name string via state.symbols[attr.name], then iterate the sorted
vector to populate buf[n].value = addValue(attr.value) and buf[n].nameLen =
state.symbols[attr.name].size(), preserving the existing buffer handling
(subspan/Attr/maxLen) and return value.attrs()->size().

Comment on lines +334 to +350
std::monostate copy_attrname(ValueId valueId, uint32_t attrIdx, uint32_t ptr, uint32_t len)
{
auto & value = *values.at(valueId);
state.forceAttrs(value, noPos, "while copying an attr name into WASM");

auto & attrs = *value.attrs();

assert((size_t) attrIdx < attrs.size());

std::string_view name = state.symbols[attrs[attrIdx].name];

assert((size_t) len == name.size());

memcpy(memory().subspan(ptr, len).data(), name.data(), name.size());

return {};
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Replace assertions with runtime checks.

Lines 341 and 345 use assert statements that are compiled out in release builds. This can lead to:

  • Line 341: Out-of-bounds array access if attrIdx >= attrs.size()
  • Line 345: Buffer overflow if len != name.size()
🔎 Proposed fix
 std::monostate copy_attrname(ValueId valueId, uint32_t attrIdx, uint32_t ptr, uint32_t len)
 {
     auto & value = *values.at(valueId);
     state.forceAttrs(value, noPos, "while copying an attr name into WASM");
 
     auto & attrs = *value.attrs();
 
-    assert((size_t) attrIdx < attrs.size());
+    if ((size_t) attrIdx >= attrs.size())
+        throw Error("attribute index %d out of bounds (size: %d)", attrIdx, attrs.size());
 
     std::string_view name = state.symbols[attrs[attrIdx].name];
 
-    assert((size_t) len == name.size());
+    if ((size_t) len != name.size())
+        throw Error("buffer size mismatch: expected %d bytes, got %d", name.size(), len);
 
     memcpy(memory().subspan(ptr, len).data(), name.data(), name.size());
 
     return {};
 }
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 334-350, In copy_attrname, replace
the compile-time asserts with runtime validations: check that attrIdx <
attrs.size() and that len == name.size(); if either check fails, raise a runtime
error (e.g., throw std::runtime_error with a clear message mentioning valueId,
attrIdx, len and name.size()) or otherwise signal a trap/error path instead of
proceeding; only perform memcpy when both checks pass and keep the rest of the
logic (state.forceAttrs, obtaining name from state.symbols, and writing into
memory()) unchanged.

Comment on lines +411 to +418
// FIXME: make thread-safe.
// FIXME: make this a weak Boehm GC pointer so that it can be freed during GC.
static std::unordered_map<SourcePath, ref<NixWasmInstancePre>> instancesPre;

auto instancePre = instancesPre.find(wasmPath);
if (instancePre == instancesPre.end())
instancePre = instancesPre.emplace(wasmPath, make_ref<NixWasmInstancePre>(wasmPath)).first;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if EvalState or prim_* functions are used in concurrent contexts
rg -n "std::thread|std::async|pthread" --type=cpp -C 2 | head -50

# Check for existing thread-safety patterns in primops
rg -n "std::mutex|std::lock_guard|std::atomic" src/libexpr/ --type=cpp -C 2 | head -50

Repository: DeterminateSystems/nix-src

Length of output: 5273


🏁 Script executed:

#!/bin/bash
# Look at the actual file context around lines 411-418
cat -n src/libexpr/primops/wasm.cc | sed -n '400,430p'

Repository: DeterminateSystems/nix-src

Length of output: 1640


🏁 Script executed:

#!/bin/bash
# Check how EvalState handles concurrent evaluation
rg -n "EvalState.*thread|thread.*EvalState" --type=cpp -C 3 | head -40

# Look for parallel evaluation support
rg -n "parallel|concurrent|thread-safe" src/libexpr/ --type=cpp | grep -i "eval\|prim" | head -20

Repository: DeterminateSystems/nix-src

Length of output: 2211


Fix thread-unsafe static cache in prim_wasm.

The static instancesPre map on line 413 is not thread-safe and will race under parallel evaluation. Nix supports parallel evaluation (see builtins.parallel and parallel-eval.cc), making this a critical issue.

The cache grows unbounded and is never freed. Use boost::concurrent_flat_map (already used elsewhere in EvalState for srcToStore and importResolutionCache) and consider weak pointers or LRU eviction to prevent memory bloat.

🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 411-418, The static unordered_map
instancesPre is not thread-safe and leaks entries; replace it with a thread-safe
concurrent map (use boost::concurrent_flat_map keyed by SourcePath) and store
weak references instead of owning refs so entries can be reclaimed (e.g., map
value -> weak pointer to NixWasmInstancePre or equivalent weak_ref). Update the
lookup/insert logic around instancesPre and the use of
make_ref<NixWasmInstancePre>(wasmPath) so you first attempt to lock the weak
entry, create a new instance only if expired, and store a weak reference in the
concurrent_flat_map; optionally add an eviction or LRU policy if unbounded
growth remains a concern.

Comment on lines +433 to +439
static RegisterPrimOp primop_fromTOML(
{.name = "wasm",
.args = {"wasm", "entry", "arg"},
.doc = R"(
Call a WASM function with the specified argument.
)",
.fun = prim_wasm});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix misleading variable name.

The variable is named primop_fromTOML but actually registers the wasm primop. This appears to be a copy-paste error from another primop definition.

🔎 Proposed fix
-static RegisterPrimOp primop_fromTOML(
+static RegisterPrimOp primop_wasm(
     {.name = "wasm",
      .args = {"wasm", "entry", "arg"},
-     .doc = R"(
-      Call a WASM function with the specified argument.
-     )",
+     .doc = R"(
+       Call a WASM function from a WebAssembly module.
+       
+       Arguments:
+       - wasm: Path to the .wasm file
+       - entry: Name of the exported function to call
+       - arg: Argument to pass to the function (Nix value)
+       
+       Returns the result from the WASM function as a Nix value.
+       The WASM module must export a `nix_wasm_init_v1` function.
+     )",
      .fun = prim_wasm});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
static RegisterPrimOp primop_fromTOML(
{.name = "wasm",
.args = {"wasm", "entry", "arg"},
.doc = R"(
Call a WASM function with the specified argument.
)",
.fun = prim_wasm});
static RegisterPrimOp primop_wasm(
{.name = "wasm",
.args = {"wasm", "entry", "arg"},
.doc = R"(
Call a WASM function from a WebAssembly module.
Arguments:
- wasm: Path to the .wasm file
- entry: Name of the exported function to call
- arg: Argument to pass to the function (Nix value)
Returns the result from the WASM function as a Nix value.
The WASM module must export a `nix_wasm_init_v1` function.
)",
.fun = prim_wasm});
🤖 Prompt for AI Agents
In @src/libexpr/primops/wasm.cc around lines 433-439, The static variable name
`primop_fromTOML` is misleading for the WASM primop registration; rename the
variable to a descriptive identifier like `primop_wasm` (i.e., change `static
RegisterPrimOp primop_fromTOML(...` to `static RegisterPrimOp primop_wasm(...`)
so it matches the registered `.name = "wasm"` and `.fun = prim_wasm`; update any
local references to this static variable if present in the file to use the new
name.

Comment on lines 29 to 72
void execBuilder(const Strings & args, const Strings & envStrs) override
{
using namespace wasmtime;

Engine engine;
Linker linker(engine);
unwrap(linker.define_wasi());

WasiConfig wasiConfig;
wasiConfig.inherit_stdin();
wasiConfig.inherit_stdout();
wasiConfig.inherit_stderr();
wasiConfig.argv(std::vector(args.begin(), args.end()));
{
std::vector<std::pair<std::string, std::string>> env2;
for (auto & [k, v] : env)
env2.emplace_back(k, rewriteStrings(v, inputRewrites));
wasiConfig.env(env2);
}
if (!wasiConfig.preopen_dir(
store.config->realStoreDir.get(),
store.storeDir,
WASMTIME_WASI_DIR_PERMS_READ | WASMTIME_WASI_DIR_PERMS_WRITE,
WASMTIME_WASI_FILE_PERMS_READ | WASMTIME_WASI_FILE_PERMS_WRITE))
throw Error("cannot add store directory to WASI config");
// FIXME: add temp dir

auto module = unwrap(Module::compile(engine, string2span(readFile(realPathInSandbox(drv.builder)))));
wasmtime::Store wasmStore(engine);
unwrap(wasmStore.context().set_wasi(std::move(wasiConfig)));
auto instance = unwrap(linker.instantiate(wasmStore, module));

auto startName = "_start";
auto ext = instance.get(wasmStore, startName);
if (!ext)
throw Error("WASM module '%s' does not export function '%s'", drv.builder, startName);
auto fun = std::get_if<Func>(&*ext);
if (!fun)
throw Error("export '%s' of WASM module '%s' is not a function", startName, drv.builder);

unwrap(fun->call(wasmStore.context(), {}));

_exit(0);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd src/libstore/unix/build && wc -l wasi-derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 99


🏁 Script executed:

cd src/libstore/unix/build && head -80 wasi-derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 2592


🏁 Script executed:

rg -n "class.*DerivationBuilder" src/libstore/unix/build/ -A 5

Repository: DeterminateSystems/nix-src

Length of output: 510


🏁 Script executed:

rg -n "initEnv" src/libstore/unix/build/ -A 10

Repository: DeterminateSystems/nix-src

Length of output: 2704


🏁 Script executed:

rg -n "runChild" src/libstore/unix/build/build-user.cc -A 15

Repository: DeterminateSystems/nix-src

Length of output: 147


🏁 Script executed:

rg -n "runChild" src/libstore/ -B 2 -A 15

Repository: DeterminateSystems/nix-src

Length of output: 9282


🏁 Script executed:

sed -n '1057,1150p' src/libstore/unix/build/derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 4230


🏁 Script executed:

rg -n "void.*execBuilder" src/libstore/ -B 5 -A 5

Repository: DeterminateSystems/nix-src

Length of output: 3761


🏁 Script executed:

sed -n '1276,1360p' src/libstore/unix/build/derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 2796


🏁 Script executed:

rg -n "tmpDirInSandbox" src/libstore/unix/build/ -A 3

Repository: DeterminateSystems/nix-src

Length of output: 4675


🏁 Script executed:

sed -n '1340,1390p' src/libstore/unix/build/derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 2017


🏁 Script executed:

rg -n "for.*env\|envStrs" src/libstore/unix/build/derivation-builder.cc -A 3 -B 2 | grep -A 5 "1340\|1350\|1360\|1370"

Repository: DeterminateSystems/nix-src

Length of output: 52


🏁 Script executed:

sed -n '1350,1380p' src/libstore/unix/build/derivation-builder.cc

Repository: DeterminateSystems/nix-src

Length of output: 1016


Address unused envStrs parameter and duplicate env rewriting

The envStrs parameter is computed in runChild() by iterating over env and applying rewriteStrings() to each key=value pair, then passed to execBuilder(). The WASI override ignores this and rebuilds env2 using identical logic. Reuse envStrs by parsing it to avoid duplication and ensure consistency if env handling changes.

Preopen the temp directory and adjust store permissions

initEnv() sets TMPDIR, NIX_BUILD_TOP, and PWD to tmpDirInSandbox(), but only the store directory is preopened in the WASI config. WASM builders attempting to write to $TMPDIR or access $NIX_BUILD_TOP will fail. Additionally, the store is preopened with READ|WRITE permissions on all files; typical chroot builds expose stores read-only, with only outputs writable. Add a preopen for the temp directory and consider making the store read-only while exposing outputs or scratch locations separately to match existing sandbox assumptions.

@github-actions github-actions bot temporarily deployed to pull request January 5, 2026 15:17 Inactive
@github-actions
Copy link

github-actions bot commented Jan 5, 2026

This function was named incorrectly. It denotes the path in the host
filesystem, not in the sandbox.
@github-actions github-actions bot temporarily deployed to pull request January 5, 2026 17:20 Inactive
Even though it's in a separate output, disabling it reduces the size
of libwasmtime.so by around 12 MB.
This reduces the size of the Nix closure by another 12 MB.
@github-actions github-actions bot temporarily deployed to pull request January 6, 2026 14:08 Inactive
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a copyright notice that points back to nixpkgs, since this is a verbatim copy of the package definition from there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants